Smart Paper Notes

SMPL: Simulated Industrial Manufacturing and Process Control Learning Environments

Mohan Zhang, Xiaozhou Wang, Benjamin Decardi-Nelson, Bo Song, An Zhang, Jinfeng Liu, Sile Tao, Jiayi Cheng, Xiaohong Liu, DengDeng Yu

Category: Machine Learning

2022-06-17

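Traditional biological and pharmaceutical manufacturing plants are controlled by human workers or predefined thresholds. Modernized factories use advanced process control algorithms such as model predictive control (MPC). However, applying deep reinforcement learning to the control of manufacturing plants has barely been explored. One reason is the lack of high-fidelity simulations and of a standard API for benchmarking. To bridge this gap, we develop an easy-to-use library that includes five high-fidelity simulation environments: BeerFMTEnv, ReactorEnv, AtropineEnv, PenSimEnv, and mAbEnv, which cover a wide range of manufacturing processes. We build these environments on published dynamics models. In addition, we benchmark online and offline, model-based and model-free reinforcement learning algorithms as a point of comparison for follow-up research.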
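As a rough illustration of what a standard environment API and a predefined-threshold controller can look like, here is a minimal sketch. The `ToyTankEnv` class, its dynamics, and the `threshold_policy` function are hypothetical and are not taken from the library described above; only the `reset`/`step` pattern common to reinforcement learning environments is assumed.

```python
class ToyTankEnv:
    """Hypothetical one-variable process: a tank level driven by an inflow valve."""

    def __init__(self, setpoint=5.0, horizon=50):
        self.setpoint = setpoint
        self.horizon = horizon
        self.level = 0.0
        self.t = 0

    def reset(self):
        self.level = 0.0
        self.t = 0
        return self.level

    def step(self, action):
        # The action is the inflow, clipped to [0, 1]; outflow is 10% of the level.
        inflow = min(max(action, 0.0), 1.0)
        self.level += inflow - 0.1 * self.level
        self.t += 1
        reward = -abs(self.level - self.setpoint)  # penalize distance from setpoint
        done = self.t >= self.horizon
        return self.level, reward, done, {}


def threshold_policy(level, low=4.5, high=5.5):
    """Predefined-threshold control: full inflow below `low`, none above `high`."""
    if level < low:
        return 1.0
    if level > high:
        return 0.0
    return 0.5


def run_episode(env, policy):
    """Roll out one episode and return the final level and the total reward."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
    return obs, total
```

Any controller that maps an observation to an action, whether a threshold rule, MPC, or a learned policy, can be passed to `run_episode`, which is what makes a shared interface useful for benchmarking.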

Translated by Google Translate

DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks

Yonggan Fu, Haichuan Yang, Jiayi Yuan, Meng Li, Cheng Wan, Raghuraman Krishnamoorthi, Vikas Chandra, Yingyan Lin

Categories: Machine Learning | Computer Vision

2022-06-02

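Efficient deep neural network (DNN) models equipped with compact operators (e.g., depthwise convolutions) have shown great potential for reducing the theoretical complexity of DNNs (e.g., the total number of weights or operations) while maintaining decent model accuracy. However, existing efficient DNNs are still limited in fulfilling their promise of boosting real-hardware efficiency, because the compact operators they commonly adopt have low hardware utilization. In this work, we open up a new compression paradigm for developing real-hardware-efficient DNNs, boosting hardware efficiency while maintaining model accuracy. Interestingly, we observe that although the activation functions of some DNN layers help training optimization and achievable accuracy, they can be properly removed after training without compromising model accuracy. Inspired by this observation, we propose a framework called DepthShrinker, which develops hardware-friendly compact networks by shrinking the basic building blocks of existing efficient DNNs, which feature irregular computation patterns, into dense ones with much improved hardware utilization and thus better real-hardware efficiency. Excitingly, our DepthShrinker framework delivers hardware-friendly compact networks that outperform both state-of-the-art efficient DNNs and compression techniques. Our code is available at: https://github.com/facebookresearch/depthshrinker.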
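The abstract does not spell out the mechanism, but one way to see why removing an activation can help is that two adjacent linear layers with nothing between them collapse into a single dense layer. The sketch below shows that identity for fully connected layers, assuming plain Python lists as matrices; the function names are illustrative and are not taken from the DepthShrinker code.

```python
def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def linear(w, b, x):
    """Apply the affine layer y = W x + b."""
    return [v + bi for v, bi in zip(matvec(w, x), b)]


def merge_linear(w1, b1, w2, b2):
    """Fold y = W2 (W1 x + b1) + b2 into a single layer y = W x + b.

    Valid only when no nonlinearity sits between the two layers.
    """
    w = matmul(w2, w1)
    b = [v + bi for v, bi in zip(matvec(w2, b1), b2)]
    return w, b
```

With an activation between the layers the fold is no longer exact, which is why the activation has to be removed first.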


Spatial Autoregressive Coding for Graph Neural Recommendation

Jiayi Zheng, Ling Yang, Heyuan Wang, Cheng Yang, Yinghong Li, Xiaowei Hu, Shenda Hong

Categories: Artificial Intelligence | Machine Learning

2022-05-19

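Graph embedding methods, including traditional shallow models and deep graph neural networks (GNNs), have led to promising applications. However, because of their optimization paradigm, shallow models, especially random-walk-based algorithms, fail to adequately exploit neighbor proximity in sampled subgraphs or sequences. GNN-based algorithms make insufficient use of high-order information and easily cause over-smoothing when too many layers are stacked, which may degrade recommendations for low-degree (long-tail) items and limits expressiveness and scalability. In this paper, we propose a novel framework, SAC (Spatial Autoregressive Coding), to solve the above problems in a unified way. To fully exploit neighbor proximity and high-order information, we design a novel spatial autoregressive paradigm. Specifically, we first randomly mask multi-hop neighbors and embed the target node by integrating all other surrounding neighbors with an explicit multi-hop attention. We then push the model to learn a neighbor-predictive coding for the target node by contrasting the coding with the embeddings of the masked neighbors, using a new hard negative sampling strategy. To learn the minimal sufficient representation for the target-to-neighbor prediction task and remove redundancy among neighbors, we devise a Neighbor Information Bottleneck that maximizes the mutual information between the target's predictive coding and the embeddings of the masked neighbors, while simultaneously constraining the mutual information between the coding and the embeddings of the surrounding neighbors. Experimental results on public recommendation datasets and on a real-scenario web-scale dataset, Douyin-Friend-Recommendation, demonstrate the superiority of SAC over state-of-the-art methods.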


PanopticPartFormer++: A Unified and Decoupled View for Panoptic Part Segmentation

Xiangtai Li, Shilin Xu, Yibo Yang, Haobo Yuan, Guangliang Cheng, Yunhai Tong, Zhouchen Lin, Dacheng Tao

Category: Computer Vision

2023-01-03

Panoptic Part Segmentation (PPS) unifies panoptic segmentation and part segmentation into one task. Previous works use separate approaches to handle thing, stuff, and part predictions, without shared computation or task association. We aim to unify these tasks at the architectural level, designing the first end-to-end unified framework, named Panoptic-PartFormer. Moreover, we find that the previous metric, PartPQ, is biased toward PQ. To handle both issues, we make the following contributions. First, we design a meta-architecture that decouples the part feature from the things/stuff feature. We model things, stuff, and parts as object queries and directly learn to optimize all three forms of prediction as a unified mask prediction and classification problem. We term our model Panoptic-PartFormer. Second, we propose a new metric, Part-Whole Quality (PWQ), to better measure this task from both pixel-region and part-whole perspectives. It can also decouple the errors of part segmentation and panoptic segmentation. Third, inspired by Mask2Former and building on our meta-architecture, we propose Panoptic-PartFormer++ and design a new part-whole cross-attention scheme, based on masked cross attention, to further boost part segmentation quality. Finally, extensive ablation studies and analysis demonstrate the effectiveness of both Panoptic-PartFormer and Panoptic-PartFormer++. Compared with the previous Panoptic-PartFormer, Panoptic-PartFormer++ achieves improvements of 2% PartPQ and 3% PWQ on the Cityscapes PPS dataset and 5% PartPQ on the Pascal Context PPS dataset. On both datasets, Panoptic-PartFormer++ achieves new state-of-the-art results with a significant cost reduction of 70% in GFLOPs and 50% in parameters. Our models can serve as a strong baseline and aid future research in PPS. Code will be available.


Ranking Differential Privacy

Shirong Xu, Will Wei Sun, Guang Cheng

Categories: Machine Learning (Statistics) | Machine Learning

2023-01-02

Rankings are widely collected in various real-life scenarios, leading to the leakage of personal information such as users' preferences for videos or news. To protect rankings, existing works mainly develop privacy protection for a single ranking within a set of rankings, or for pairwise comparisons of a ranking, under $\epsilon$-differential privacy. This paper proposes a novel notion called $\epsilon$-ranking differential privacy for protecting ranks. We establish the connection between the Mallows model (Mallows, 1957) and the proposed $\epsilon$-ranking differential privacy. This allows us to develop a multistage ranking algorithm that generates synthetic rankings while satisfying $\epsilon$-ranking differential privacy. We establish theoretical results on the utility of synthetic rankings in downstream tasks, including the inference attack and personalized ranking tasks. For the inference attack, we quantify how $\epsilon$ affects the estimation of the true ranking based on synthetic rankings. For the personalized ranking task, we consider varying privacy preferences among users and quantify how those preferences affect the consistency of estimating the optimal ranking function. Extensive numerical experiments verify the theoretical results and demonstrate the effectiveness of the proposed synthetic ranking algorithm.
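For readers unfamiliar with the Mallows model mentioned above, the following is a minimal sketch of its usual form, in which the probability of a ranking decays exponentially with its distance from a central ranking. The Kendall tau distance, the dispersion parameter `theta`, and the brute-force normalization are assumptions made for illustration; the abstract does not state how $\epsilon$ maps onto the model, and this is not the paper's multistage algorithm.

```python
from itertools import permutations
from math import exp


def kendall_tau(a, b):
    """Number of item pairs that rankings a and b order differently."""
    pos = {item: i for i, item in enumerate(b)}
    mapped = [pos[item] for item in a]
    return sum(
        1
        for i in range(len(mapped))
        for j in range(i + 1, len(mapped))
        if mapped[i] > mapped[j]
    )


def mallows_pmf(center, theta):
    """P(r) proportional to exp(-theta * d(r, center)), normalized by enumeration.

    Enumerating all permutations is only feasible for a handful of items.
    """
    weights = {r: exp(-theta * kendall_tau(r, center)) for r in permutations(center)}
    z = sum(weights.values())
    return {r: w / z for r, w in weights.items()}
```

A larger `theta` concentrates probability on the central ranking, while `theta = 0` gives the uniform distribution over all rankings, so the parameter trades fidelity to the true ranking against randomness in the released one.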


Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation

Jianzong Wu, Xiangtai Li, Henghui Ding, Xia Li, Guangliang Cheng, Yunhai Tong, Chen Change Loy

Category: Computer Vision

2023-01-02

In this work, we focus on instance-level open vocabulary segmentation, intending to expand a segmenter for instance-wise novel categories without mask annotations. We investigate a simple yet effective framework with the help of image captions, focusing on exploiting thousands of object nouns in captions to discover instances of novel classes. Rather than adopting pretrained caption models or using massive caption datasets with complex pipelines, we propose an end-to-end solution from two aspects: caption grounding and caption generation. In particular, we devise a joint Caption Grounding and Generation (CGG) framework based on a Mask Transformer baseline. The framework has a novel grounding loss that performs explicit and implicit multi-modal feature alignments. We further design a lightweight caption generation head to allow for additional caption supervision. We find that grounding and generation complement each other, significantly enhancing the segmentation performance for novel categories. We conduct extensive experiments on the COCO dataset with two settings: Open Vocabulary Instance Segmentation (OVIS) and Open Set Panoptic Segmentation (OSPS). The results demonstrate the superiority of our CGG framework over previous OVIS methods, achieving a large improvement of 6.8% mAP on novel classes without extra caption data. Our method also achieves over 15% PQ improvements for novel classes on the OSPS benchmark under various settings.


Rethinking the Video Sampling and Reasoning Strategies for Temporal Sentence Grounding

Jiahao Zhu, Daizong Liu, Pan Zhou, Xing Di, Yu Cheng, Song Yang, Wenzheng Xu, Zichuan Xu, Yao Wan, Lichao Sun

Category: Computer Vision

2023-01-02

Temporal sentence grounding (TSG) aims to identify the temporal boundary of a specific segment in an untrimmed video from a sentence query. All existing works first use a sparse sampling strategy to extract a fixed number of video frames and then conduct multi-modal interactions with the query sentence for reasoning. However, we argue that these methods overlook two important issues: 1) Boundary-bias: The annotated target segment generally refers to two specific frames as its start and end timestamps. The video downsampling process may lose these two frames and take adjacent, irrelevant frames as the new boundaries. 2) Reasoning-bias: Such incorrect new boundary frames also bias the reasoning during frame-query interaction, reducing the generalization ability of the model. To alleviate these limitations, we propose a novel Siamese Sampling and Reasoning Network (SSRN) for TSG, which introduces a siamese sampling mechanism to generate additional contextual frames that enrich and refine the new boundaries. Specifically, a reasoning strategy is developed to learn the inter-relationships among these frames and generate soft labels on the boundaries for more accurate frame-query reasoning. This mechanism can also supply the sparsely sampled frames with the consecutive visual semantics they lack, for fine-grained activity understanding. Extensive experiments demonstrate the effectiveness of SSRN on three challenging datasets.
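The boundary-bias issue can be illustrated with a small sketch, assuming uniform sparse sampling and nearest-frame snapping. Both function names and the sampling scheme are hypothetical and are not taken from SSRN; they only show how an annotated boundary can fall between sampled frames.

```python
def uniform_sample(num_frames, num_samples):
    """Indices of num_samples frames spread evenly over num_frames frames."""
    step = num_frames / num_samples
    # Take the frame at the middle of each equal-length chunk.
    return [int(step * i + step / 2) for i in range(num_samples)]


def snap_to_samples(boundary, sampled):
    """Nearest sampled frame to an annotated boundary frame."""
    return min(sampled, key=lambda idx: abs(idx - boundary))
```

For a 300-frame video sampled at 10 frames, the sampled indices are 15, 45, and so on up to 285, so an annotated segment from frame 100 to frame 170 is represented by frames 105 and 165. Neither annotated boundary frame survives the downsampling.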


Diffusion Model based Semi-supervised Learning on Brain Hemorrhage Images for Efficient Midline Shift Quantification

Shizhan Gong, Cheng Chen, Yuqi Gong, Nga Yan Chan, Wenao Ma, Calvin Hoi-Kwan Mak, Jill Abrigo, Qi Dou

Categories: Computer Vision | Artificial Intelligence

2023-01-01

Brain midline shift (MLS) is one of the most critical factors to consider in clinical diagnosis and treatment decision-making for intracranial hemorrhage. Existing computational methods for MLS quantification not only require intensive labeling with millimeter-level measurements but also perform poorly because they depend on specific landmarks or simplified anatomical assumptions. In this paper, we propose a novel semi-supervised framework to accurately measure the scale of MLS from head CT scans. We formulate the MLS measurement task as a deformation estimation problem and solve it using a few MLS slices with sparse labels. Meanwhile, with the help of diffusion models, we are able to use a large amount of unlabeled MLS data and 2793 non-MLS cases for representation learning and regularization. The extracted representation reflects how the image differs from a non-MLS image, and regularization plays an important role in the sparse-to-dense refinement of the deformation field. In experiments on a real clinical brain hemorrhage dataset, our method achieves state-of-the-art performance and generates interpretable deformation fields.


CORGI-PM: A Chinese Corpus For Gender Bias Probing and Mitigation

Ge Zhang, Yizhi Li, Yaoyao Wu, Linyuan Zhang, Chenghua Lin, Jiayi Geng, Shi Wang, Jie Fu

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2023-01-01

As natural language processing (NLP) for gender bias becomes a significant interdisciplinary topic, prevalent data-driven techniques such as large-scale language models suffer from inadequate data and biased corpora, especially for languages with insufficient resources such as Chinese. To this end, we propose a Chinese cOrpus foR Gender bIas Probing and Mitigation (CORGI-PM), which contains 32.9k sentences with high-quality labels derived by following an annotation scheme specifically developed for gender bias in the Chinese context. Moreover, we address three challenges in automatic textual gender bias mitigation, which require models to detect, classify, and mitigate textual gender bias. We also conduct experiments with state-of-the-art language models to provide baselines. To the best of our knowledge, CORGI-PM is the first sentence-level Chinese corpus for gender bias probing and mitigation.


A novel cluster internal evaluation index based on hyper-balls

Jiang Xie, Pengfei Zhao, Shuyin Xia, Guoyin Wang, Dongdong Cheng

Categories: Machine Learning | Artificial Intelligence

2022-12-30

In cluster analysis, it is crucial to evaluate clustering quality and to determine the optimal number of clusters. In this paper, a multi-granularity characterization of the data set is carried out to obtain hyper-balls, and a cluster internal evaluation index based on hyper-balls (HCVI) is defined. Moreover, a general method for determining the optimal number of clusters based on HCVI is proposed. The proposed methods can evaluate the clustering results produced by several classic methods and determine the optimal number of clusters for data sets that contain noise and clusters of arbitrary shape. Experimental results on synthetic and real data sets indicate that the new index outperforms existing ones.
